Abstract

Background: Beef carcass conformation and fat cover scores are measured by subjective grading performed by 
trained technicians. The discrete nature of these scores is taken into account in genetic evaluations using a threshold 
model, which assumes an underlying continuous distribution called liability that can be modelled by different methods. 

Methods: Five threshold models were compared in this study. Three were threshold linear models:
one with homogeneous thresholds that included slaughterhouse and sex effects along with other
systematic effects, and two extensions with heterogeneous thresholds that vary across
slaughterhouses and across slaughterhouse and sex. The fourth was a generalised linear model
with reverse extreme value errors; for this model, the underlying variable followed
a Weibull distribution and the model was both a log-linear model and a grouped data model.
The fifth model was an extension of grouped data models with score-dependent effects in order
to allow for heterogeneous thresholds that vary across slaughterhouse and sex.
Goodness-of-fit of these models was tested using the bootstrap
methodology. Field data included 2,539 carcasses of the Bruna dels Pirineus beef cattle breed. 

Results: Differences in carcass conformation and fat cover scores among slaughterhouses could not be totally 
captured by a systematic slaughterhouse effect, as fitted in the threshold linear model with homogeneous 
thresholds, and different thresholds per slaughterhouse were estimated using a slaughterhouse-specific threshold 
model. This model fixed most of the deficiencies when stratification by slaughterhouse was done, but it still failed 
to correctly fit frequencies stratified by sex, especially for fat cover, as 5 of the 8 actual percentages were not
included within the bootstrap interval. This indicates that scoring varied with sex and a specific sex per 
slaughterhouse threshold linear model should be used in order to guarantee the goodness-of-fit of the genetic 
evaluation model. This was also observed in grouped data models that avoided fitting deficiencies when 
slaughterhouse and sex effects were score-dependent. 

Conclusions: Both threshold linear models and grouped data models can guarantee the goodness-of-fit of the 
genetic evaluation for carcass conformation and fat cover, but our results highlight the need for specific thresholds 
by sex and slaughterhouse in order to avoid fitting deficiencies. 



Background 

Beef cattle production is becoming increasingly 
concerned with meat and carcass quality traits [1]. Currently,
beef cattle genetic evaluations include mainly
growth traits, but carcass traits are also economically
important [2]. European beef producers are paid based
on the weight of the animals at slaughter and on carcass
conformation (CON) and fat cover (FAT) scores. All
carcasses are classified at commercial slaughterhouses 
according to CON and FAT scores measured by subjec- 
tive grading performed by trained technicians. These 
subjective records usually involve classification under a 
categorical and arbitrarily predefined scale, which may 
lead to strong departures from the Gaussian distribu- 
tion. Theoretically, the discrete nature of performance 
traits is taken into account in genetic evaluations using 
a threshold linear model [3], which assumes an underlying
continuous distribution called liability. This model
includes thresholds that link the underlying distribution
with the observed scale. However, in some cases, differ- 
ent technicians may use different intervals on the cate- 
gorical scale, or a wider or narrower range of values for 
the subjective grading. Thus, the link between the 
observed scale and the liability scale could be specific to 
each technician. In 2006, Varona and Hernandez [4] 
proposed a specific ordered category threshold linear 
model for sensory data and concluded that each panelist 
used a different pattern of categorization. In 2009, Var- 
ona et al. [5] compared different threshold linear models 
using the deviance information criterion and showed 
that the most plausible model to analyse carcass traits 
was the slaughterhouse specific ordered category thresh- 
old linear model. This result was confirmed by the fact 
that the threshold estimates differed notably between 
slaughterhouses. 

Liability may follow many distributions, such as the 
Gaussian distribution (probit model), the logistic 
distribution (logit model) or the reverse extreme value 
distribution. This latter distribution is a log-Weibull 
distribution and the resulting model can therefore be 
framed as a linear model for the logarithm of the liability.
The Weibull distribution (including the exponential
distribution as a special case) is commonly used in survival
analysis and it can be parameterised as either a propor- 
tional hazards model or a log-linear model. It is the only 
family of distributions that has this property [6]. Whereas 
a proportional hazards model assumes that the effect of a 
covariate is to multiply the hazard by some constant, a 
log-linear model assumes that the effect of a covariate is 
to multiply the underlying variable by some constant [6].
The results of fitting a Weibull model can therefore be 
interpreted in both frameworks. 
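
The dual interpretation of the Weibull model can be checked numerically. The sketch below is only an illustration, assuming the parameterisation S(t) = exp(-(lam * t) ** rho); the function names and the numerical values of lam, rho and eta are hypothetical and do not come from the study. It shows that multiplying the hazard by exp(eta) gives the same survival function as multiplying the underlying variable by exp(-eta / rho).

```python
import math


def weibull_survival(t, lam, rho):
    """Complementary cumulative distribution S(t) = exp(-(lam * t) ** rho)."""
    return math.exp(-((lam * t) ** rho))


def survival_proportional_hazards(t, lam, rho, eta):
    """Hazard multiplied by exp(eta), so the survival is S(t) ** exp(eta)."""
    return weibull_survival(t, lam, rho) ** math.exp(eta)


def survival_log_linear(t, lam, rho, eta):
    """Underlying variable multiplied by exp(-eta / rho).

    If T = T0 * exp(-eta / rho), then P(T > t) = S0(t * exp(eta / rho)).
    """
    return weibull_survival(t * math.exp(eta / rho), lam, rho)


# Hypothetical parameter values, chosen only for illustration.
lam, rho, eta = 0.5, 2.0, 0.3
for t in (0.5, 1.0, 2.0):
    ph = survival_proportional_hazards(t, lam, rho, eta)
    ll = survival_log_linear(t, lam, rho, eta)
    print(f"t={t}: proportional hazards={ph:.6f}, log-linear={ll:.6f}")
```

Both columns printed by the loop agree, which is the property that allows the results of a Weibull model to be read in either framework.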

Prentice and Gloeckler [7] presented the "grouped data 
model" for analysis of discrete data while maintaining the 
assumption of proportional hazards. Ducrocq [8] repara- 
meterized and extended grouped data models to include 
random effects for animal breeding applications. Tarres et 
al. [9] showed that Ducrocq's formulae [8], drawn from 
the grouped data model for survival analysis (where the 
value of the underlying variable is necessarily larger than 
0), can be applied to an underlying variable with negative 
values. They also highlighted the flexibility of the grouped 
data model for the analysis of discrete traits, such as cal- 
ving ease of beef calves, in comparison to homoscedastic 
and heteroscedastic threshold linear models. 

Given the diversity of models to analyse discrete 
variables such as CON and FAT scores, comparing these 
models requires specific tools to test goodness-of-fit with 
real data. Bootstrap approaches, introduced by Efron 
[10], have become routine methods to approximate the 
distribution of a parameter of interest, and have been 
applied to the animal breeding framework [11,12]. In
2006, Casellas et al. [13] proposed a parametric bootstrap
procedure to test goodness-of-fit that provides a clear 
framework to compare predicted and actual distributions 
of variables of interest. Significant fitting deficiencies are 
revealed when the distribution of the actual data is not 
included within the bootstrap interval. This bootstrap 
approach could be a very useful tool to validate models 
by direct assessment of the ability of the model to fit the 
actual data. 

The aim of this work was to perform a parametric 
bootstrap procedure to test the goodness-of-fit of three 
threshold linear models, a threshold log-linear Weibull 
model, and a grouped data model for the analysis of car- 
cass conformation and fat cover in beef cattle. The three 
threshold linear models were a model with slaughter- 
house and sex effects, along with other systematic 
effects, with homogeneous thresholds, and two exten- 
sions with heterogeneous thresholds that vary across 
slaughterhouses and across slaughterhouse and sex. 

Methods 

Data 

Bruna dels Pirineus is a beef type breed selected from 
the old Brown Swiss (derived from the Canton Schwyz) 
with herds located in the Pyrenean mountain areas of 
Catalonia (Spain). From October/November to June, 
when most of the calving occurs, the animals remain in 
the valleys close to the villages and then the cows and 
calves are taken to the mountains to graze alpine pas- 
tures. After weaning, calves are fattened by ad libitum 
feeding with barley-corn concentrate meal and straw. 
Data were recorded between 2004 and 2009 in 12 
slaughterhouses located in Catalonia (Spain), and 
included records from 2,539 beef carcasses from animals 
participating in the Yield Recording Scheme of the 
breed. Two traits were analysed in this study: the CON 
score, which describes the development of essential 
parts of the carcass profile according to the (S)EUROP 
scale (CEE no 2930/81, 1981), and the FAT score, which 
quantifies the amount of fat on the outside of the car- 
cass and in the thoracic cavity. The categorical scale of 
CON was converted to a numeric scale from 2.00 (O) to 
5.00 (E) because S and P scores were not observed. 
Similarly, FAT could have scores between 1 and 5, but 
scores over 4 were not observed. The percentages of 
each score in each slaughterhouse are presented in 
Tables 1 and 2. The data were completed with pedigree 
records provided by the Bruna dels Pirineus Breeders 
Association (FEBRUPI). Both FEBRUPI and slaughter- 
house databases were merged according to the European 
animal identification code. The pedigree file contained 
5,153 animals related to these calves, of which 332 were 
sires. Statistical analysis of these data was performed 
with different threshold models.

Table 1 Percentages of carcass conformation stratified by slaughterhouse

Bootstrap confidence intervals (95%) in parentheses, and p-values from a threshold linear model (TLM). Percentage outside the bootstrap interval if * (P < 0.05); ** (P < 0.01); *** (P < 0.001).




Threshold Linear animal Model (TLM) 

Each CON and FAT score was modelled as a discrete
variable Y conditional on an unobservable underlying
continuous variable T, referred to as liability.
The probability that the discrete variable Y has a value
k is:

$$P\{Y = k\} = P\{\tau_{k-1} < T < \tau_k\},$$

where $\tau_1$, $\tau_2$ and $\tau_3$ are thresholds that define the four
categories of response. The prior distributions of the
threshold positions were assumed to be flat. Thresholds
$\tau_2$ and $\tau_3$ are assumed to be known, i.e. arbitrarily fixed
to 0 and 2.0 for CON and FAT, to provide a simpler
sampling scheme than the one defined by fixing the
mean and the residual variance of the liability [14]. The
posterior conditional distributions for the augmented
underlying variables are censored normal distributions,
as described by Sorensen et al. [15].
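
The link between liability and observed score can be sketched in a few lines. This is a minimal illustration, not the Gibbs sampler used in the study: it assumes a Gaussian liability with a given mean `eta` and a residual standard deviation `sigma_e` of 1 (an assumption made here), and the value of the first threshold (-2.5) is hypothetical, whereas 0 and 2.0 are the fixed values stated above.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def threshold_probabilities(eta, thresholds, sigma_e=1.0):
    """Return P{Y = k} for k = 1..K under a Gaussian liability.

    eta        : mean of the liability for one record (fixed plus random effects)
    thresholds : ordered threshold positions (K - 1 values)
    sigma_e    : residual standard deviation (assumed value, not from the study)
    """
    bounds = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf((b - eta) / sigma_e) for b in bounds]
    # Probability mass of the liability between consecutive thresholds.
    return [cdf[k] - cdf[k - 1] for k in range(1, len(bounds))]


# First threshold is hypothetical; 0 and 2.0 are the fixed values in the text.
probs = threshold_probabilities(eta=1.0, thresholds=[-2.5, 0.0, 2.0])
print([round(p, 4) for p in probs])
```

Shifting `eta` upwards moves probability mass towards the higher scores, which is how systematic, herd and genetic effects act in a model with homogeneous thresholds.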

The underlying variable T had the following distribution:

$$T \sim N(\mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u}, \mathbf{I}\sigma_e^2),$$

where $\boldsymbol{\beta}$ are the regression coefficients of the systematic
effects, $\mathbf{h}$ are herd effects, $\mathbf{u}$ are direct breeding values,
$\mathbf{X}$, $\mathbf{Z}_1$ and $\mathbf{Z}_2$ are incidence matrices linking data with
their respective effects, and $\sigma_e^2$ is the residual
variance. The systematic effects included in $\boldsymbol{\beta}$, i.e.

$$\boldsymbol{\beta}' = [\boldsymbol{\beta}'_{sh} \; \boldsymbol{\beta}'_{sex} \; \boldsymbol{\beta}'_{parity} \; \boldsymbol{\beta}'_{age} \; \boldsymbol{\beta}'_{season} \; \boldsymbol{\beta}'_{year}],$$

referred to slaughterhouse (12 levels), sex (males and females), parity
(1st to 4th or more), age at slaughter (6 levels: 9 to 14
months), season at slaughter (winter, spring, summer and
autumn) and year of slaughter (2005 to 2009). Prior
distribution for herd effects (73 levels) was assumed to be
multivariate normal

$$f(\mathbf{h}) \sim N(\mathbf{0}, \mathbf{I}\sigma_h^2),$$

where $\sigma_h^2$ is the herd variance. For direct breeding
values, the prior distribution was:

$$f(\mathbf{u}) \sim N(\mathbf{0}, \mathbf{A}\sigma_u^2),$$

where $\mathbf{A}$ is the numerator relationship matrix and $\sigma_u^2$
is the additive genetic variance. The prior distributions
for systematic effects and the (co)variance components
were bounded flat uniform distributions.

Table 2 Percentages of fat cover stratified by slaughterhouse

Bayesian analysis of the Threshold Linear Model 
(TLM) was carried out with the Gibbs sampler algo- 
rithm implemented in Varona et al. [5]. Each analysis 
consisted of a single chain of 100,000 iterations, with 
the first 25,000 samples discarded. Analysis of conver- 
gence and calculation of effective sample size followed 
the algorithms by Raftery and Lewis [16]. All iterations 
in the analysis were used to compute posterior means 
and standard deviations of estimated regression coeffi- 
cients and random effects, so that all available infor- 
mation from the output of the Gibbs sampler could be 
considered. 

Specific Slaughterhouse Threshold Linear animal 
Model (SHTLM) 

This model is the same as above, except that it
estimates a specific set of thresholds for each slaughterhouse.
Now, the probability that the discrete variable
Y takes a value k is:

$$P\{Y = k\} = P\{\tau_{sh,k-1} < T < \tau_{sh,k}\},$$

where $\tau_{sh,1}$, $\tau_{sh,2}$ and $\tau_{sh,3}$ are thresholds that define the
four categories of response and have a different position
depending on the slaughterhouse (12 different slaughterhouses).
As in the previous model, the prior distributions
of the threshold positions are assumed to be flat,
and thresholds $\tau_{12,2}$ and $\tau_{12,3}$ are assumed to be known
and arbitrarily fixed to 0 and 2.0 for both traits. The
presence of specific thresholds for each slaughterhouse
should take into account the variation captured by the
slaughterhouse effect in TLM. Thus, in this model, systematic
effects were reduced to sex, parity, age at
slaughter, season and year at slaughter. Once again, a
Bayesian analysis was carried out with the Gibbs sampler
algorithm implemented as in Varona et al. [5].

Specific Sex per Slaughterhouse Threshold Linear animal
Model (SEXTLM)

This model differs from the previous ones in that it estimates
a specific set of thresholds for each sex in each
slaughterhouse. Now, the probability that the discrete
variable Y takes a value k is:

$$P\{Y = k\} = P\{\tau_{sex,sh,k-1} < T < \tau_{sex,sh,k}\},$$

where $\tau_{sex,sh,1}$, $\tau_{sex,sh,2}$ and $\tau_{sex,sh,3}$ are thresholds that
define the four categories of response and have a different
position depending on the interaction of sex and
slaughterhouse (24 levels). As in the previous model,
the prior distributions of the threshold positions are
assumed to be flat, and thresholds $\tau_{male,12,2}$ and
$\tau_{male,12,3}$ are assumed to be known and fixed to 0 and
2.0 for both traits. The presence of specific thresholds
for each sex in each slaughterhouse should take into
account the variation captured by the sex effect in
SHTLM. Thus, in this model, systematic effects were
reduced to parity, age at slaughter, season and year at
slaughter. Once again, a Bayesian analysis was carried
out with the Gibbs sampler algorithm implemented in
Varona et al. [5].

Threshold log Linear Weibull Model (TlogLWM) 

In the previous models, CON and FAT scores were
modelled as a discrete variable Y conditional to an
unobservable underlying continuous variable T, referred
to as liability, that follows a linear model. In the
TlogLWM, we assume that the liability is modelled as
follows:

$$t = t_0 \exp(-\mathbf{X}\boldsymbol{\beta} - \mathbf{Z}_1\mathbf{h} - \mathbf{Z}_2\mathbf{u}),$$

where $t_0$ follows a standard Weibull distribution. In
this case, this model is equivalent to:

$$-\rho \log t = -\rho \log \lambda + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u} + e,$$

where $e$ follows an extreme value distribution [17],
$\rho$ and $\lambda$ are the Weibull parameters, $\boldsymbol{\beta}$ are
the regression coefficients of the systematic effects, $\mathbf{h}$
are herd effects, $\mathbf{u}$ are breeding values, and $\mathbf{X}$, $\mathbf{Z}_1$
and $\mathbf{Z}_2$ are incidence matrices linking data with their
respective effects. The systematic effects included in $\boldsymbol{\beta}$,
i.e. $\boldsymbol{\beta}' = [\boldsymbol{\beta}'_{sh} \; \boldsymbol{\beta}'_{sex} \; \boldsymbol{\beta}'_{parity} \; \boldsymbol{\beta}'_{age} \; \boldsymbol{\beta}'_{season} \; \boldsymbol{\beta}'_{year}]$, were the
same as in TLM. Here it is important to note the minus
sign in front of the effects because it influences the
interpretation of the results.

The probability that the discrete variable Y has a value
k is:

$$P\{Y = k\} = P\{\tau_{k-1} < T < \tau_k\} = (1 - \alpha_k) \prod_{j<k} \alpha_j,$$

where $\tau_1$, $\tau_2$ and $\tau_3$ are homogeneous thresholds that
define the four categories of response and

$$\alpha_k = \exp\left[-\int_{\tau_{k-1}}^{\tau_k} h(t)\,dt\right],$$

with $h(\cdot)$ being the underlying
hazard function that is the ratio of the probability density
function to the complementary cumulative distribution
function [8]. This hazard function follows a proportional
hazard model $h(t) = h_0(t)\exp(\mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})$ with $h_0(\cdot)$
being the baseline Weibull hazard function.

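In our data, each CON and FAT score can take four
values k = 1, 2, 3 or 4. Then, the probability that the
discrete variable Y has a value k was calculated as:

$$P\{Y = 1\} = (1 - \alpha_1)$$
$$P\{Y = 2\} = \alpha_1 (1 - \alpha_2)$$
$$P\{Y = 3\} = \alpha_1 \alpha_2 (1 - \alpha_3)$$
$$P\{Y = 4\} = \alpha_1 \alpha_2 \alpha_3$$

Because $\alpha_k$ can by definition only take values between
0 and 1, it was modelled using a log-log transformation
as:

$$\alpha_1 = \exp\{-\exp(\mu_1 + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})\}$$
$$\alpha_2 = \exp\{-\exp(\mu_2 + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})\}$$
$$\alpha_3 = \exp\{-\exp(\mu_3 + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})\}$$

where $\mu_1$, $\mu_2$ and $\mu_3$ were mean values ranging from
$-\infty$ to $+\infty$. These means were different for each k value
of CON and FAT while systematic effects $\boldsymbol{\beta}$, herd effects
$\mathbf{h}$ and breeding values $\mathbf{u}$ were the same for all the k
values.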

The Survival Kit package [18] was used to analyse the 
TlogLWM model because the likelihood expression was 
exactly the same as assuming an underlying variable 
T with a threshold proportional hazard model [8]. In 
fact, TlogLWM is a particular case of a threshold 
proportional hazard model with a baseline Weibull 
distribution. 
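
The score probabilities of this model follow directly from the log-log transformation. The sketch below is a minimal illustration and not the Survival Kit implementation: the function name and the values of the means `mu` are hypothetical, and `eta` stands for the sum of systematic, herd and genetic effects of one record.

```python
import math


def grouped_data_probabilities(mu, eta):
    """Return P{Y = k}, k = 1..len(mu) + 1, from the log-log link.

    mu  : score-specific means (mu_1, mu_2, mu_3 for four scores)
    eta : sum of systematic, herd and genetic effects for one record
    """
    alphas = [math.exp(-math.exp(m + eta)) for m in mu]
    probs = []
    surviving = 1.0  # product of alpha_j over the scores already passed
    for alpha in alphas:
        probs.append(surviving * (1.0 - alpha))
        surviving *= alpha
    probs.append(surviving)  # last score: alpha_1 * alpha_2 * alpha_3
    return probs


# Hypothetical means, for illustration only.
mu = [-1.0, 0.0, 1.0]
for eta in (0.0, 1.0):
    probs = grouped_data_probabilities(mu, eta)
    print(eta, [round(p, 4) for p in probs])
```

With this parameterisation a larger `eta` lowers every alpha and therefore shifts probability towards the lower scores, which is why the sign convention has to be kept in mind when interpreting estimated effects.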

Grouped Data Model (GDM) 

The threshold proportional hazard models are called
grouped data models [8]. In these models, the discrete
variables Y are modelled conditional to an unobservable
liability that follows a proportional hazard model. In this
case, the hazard function of the liability
$h(t) = h_0(t)\exp(\mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})$ is the product of two terms, the
baseline hazard function $h_0(\cdot)$ and the regression coefficients
term. Unlike in the previous model, in GDM the
baseline distribution of the underlying variable T can be
unknown and not necessarily Weibull, because the estimates
of regression coefficients, herd and genetic effects
will be exactly the same regardless of the distribution
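assumed.

The probability that the discrete variable Y has a value
k was calculated as before:

$$P\{Y = k\} = P\{\tau_{sex,sh,k-1} < T < \tau_{sex,sh,k}\} = (1 - \alpha_k) \prod_{j<k} \alpha_j,$$

where $\tau_{sex,sh,1}$, $\tau_{sex,sh,2}$ and $\tau_{sex,sh,3}$ are heterogeneous
thresholds that vary by slaughterhouse and sex and
define the four categories of response, and $\alpha_k$ was modelled
using a log-log transformation as:

$$\alpha_1 = \exp\{-\exp(\mu_1 + \mathbf{X}_{sh}\boldsymbol{\beta}_{sh,1} + \mathbf{X}_{sex}\boldsymbol{\beta}_{sex,1} + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})\}$$
$$\alpha_2 = \exp\{-\exp(\mu_2 + \mathbf{X}_{sh}\boldsymbol{\beta}_{sh,2} + \mathbf{X}_{sex}\boldsymbol{\beta}_{sex,2} + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})\}$$
$$\alpha_3 = \exp\{-\exp(\mu_3 + \mathbf{X}_{sh}\boldsymbol{\beta}_{sh,3} + \mathbf{X}_{sex}\boldsymbol{\beta}_{sex,3} + \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{h} + \mathbf{Z}_2\mathbf{u})\}$$

where $\mu_1$, $\mu_2$ and $\mu_3$ were mean values ranging from
$-\infty$ to $+\infty$. In our study, the variables included in $\boldsymbol{\beta}$ were
the systematic effects with incidence matrix $\mathbf{X}$,
i.e. $\boldsymbol{\beta}' = [\boldsymbol{\beta}'_{parity} \; \boldsymbol{\beta}'_{age} \; \boldsymbol{\beta}'_{season} \; \boldsymbol{\beta}'_{year}]$. On the one hand,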
these regression coefficients were the same for all values 
k of CON and FAT. On the other hand, the slaughter- 
house and sex effects were assumed to be score-depen- 
dent, i.e. different for each value k of CON and FAT 
scores. Likelihood ratio tests determined whether 
including score-dependent effects for these factors gave 
a significantly better fit. Herd effects h and breeding 
values u were assumed to be random with incidence 
matrices Z x and Z 2 that link data with their respective 
effects. Prior distributions for herd effects and genetic 
effects were chosen as in the previous models. The 
Survival Kit package [18] was used for the analysis of 
the GDM model. 

It is important to note here that the heterogeneous
threshold positions do not appear in the likelihood
expression and therefore they are not estimated. However,
they can be calculated a posteriori by assuming a
known distribution and solving
$\ln \alpha_k = \ln S(\tau_{sex,sh,k}) - \ln S(\tau_{sex,sh,k-1})$, where $S(\cdot)$ is the
complementary cumulative distribution function of the liability.
In this way, a direct relationship can be established between
score-dependent effects and heterogeneous threshold positions.
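
The a posteriori calculation of threshold positions can be sketched as follows. This is an illustration under an assumed distribution, not the computation reported in the study: the liability is taken to follow a standard Weibull distribution with S(t) = exp(-t ** rho), so that S(0) = 1 and the relation ln(alpha_k) = ln S(tau_k) - ln S(tau_(k-1)) can be solved recursively. The function names and the alpha values are hypothetical.

```python
import math


def standard_weibull_survival(t, rho):
    """Assumed complementary cumulative distribution S(t) = exp(-t ** rho)."""
    return math.exp(-(t ** rho))


def thresholds_from_alphas(alphas, rho=1.0):
    """Solve ln(alpha_k) = ln S(tau_k) - ln S(tau_{k-1}) for tau_1, tau_2, ...

    Starts from tau_0 = 0, where S(0) = 1. Under the assumed distribution
    -ln S(tau_k) = tau_k ** rho, which accumulates -ln(alpha_j) for j <= k.
    """
    cumulative = 0.0  # running value of -ln S(tau_k)
    thresholds = []
    for alpha in alphas:
        cumulative -= math.log(alpha)
        thresholds.append(cumulative ** (1.0 / rho))
    return thresholds


# Hypothetical alpha values for one sex-by-slaughterhouse class.
alphas = [0.9, 0.5, 0.2]
taus = thresholds_from_alphas(alphas, rho=2.0)
print([round(t, 4) for t in taus])
```

Because each class of sex and slaughterhouse has its own alpha values when the effects are score-dependent, repeating this calculation per class yields class-specific threshold positions.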

Parametric bootstrapping for model comparison 

A parametric bootstrap approach was applied to test the 
goodness-of-fit of the described models in the analysis of 
CON and FAT scores. The bootstrapping methodology
was the same as in Tarres et al. [9]. Confidence intervals
obtained for the frequency of each k value of CON and 
FAT were stated as being the 0.025 and 0.975 percentiles 
of the bootstrap samples, and they were easily contrasted 
with the frequencies of the actual data. Significant fitting 
deficiencies were revealed when the actual frequencies 
were outside the confidence interval for one model, and 
they could be statistically quantified through the
bootstrapped p-values [19].
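
The comparison between bootstrap intervals and actual frequencies can be sketched as below. This is a deliberately simplified illustration and not the procedure of Tarres et al. [9]: it assumes that every record shares the same fitted score probabilities, whereas in a real analysis the probabilities depend on the effects of each animal. The function names, the probabilities and the sample sizes are hypothetical.

```python
import random


def bootstrap_intervals(category_probs, n_records, n_boot=1000, seed=1):
    """Return (0.025, 0.975) percentile intervals of score percentages.

    category_probs : fitted probability of each score (assumed equal for
                     all records, a simplification made for this sketch)
    n_records      : number of records simulated in each bootstrap sample
    """
    rng = random.Random(seed)
    categories = list(range(len(category_probs)))
    samples = [[] for _ in categories]
    for _ in range(n_boot):
        # Simulate one data set from the fitted model.
        draws = rng.choices(categories, weights=category_probs, k=n_records)
        for k in categories:
            samples[k].append(100.0 * draws.count(k) / n_records)
    intervals = []
    for values in samples:
        values.sort()
        # Simple empirical percentiles of the bootstrap distribution.
        lower = values[int(0.025 * (n_boot - 1))]
        upper = values[int(0.975 * (n_boot - 1))]
        intervals.append((lower, upper))
    return intervals


def outside_interval(actual_pct, interval):
    """True when the actual percentage reveals a fitting deficiency."""
    lower, upper = interval
    return actual_pct < lower or actual_pct > upper


intervals = bootstrap_intervals([0.1, 0.3, 0.5, 0.1], n_records=300, n_boot=200)
print(intervals)
```

For instance, `outside_interval(11.80, (12.05, 14.03))` returns `True`, matching the deficiency flagged for SHTLM in males with FAT score 1 in Table 4.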

Results 

Descriptive statistics 

The average carcass of the Bruna dels Pirineus breed 
under commercial conditions weighed around 279 kg at 
12.5 months of age (377 d), with an average CON score 
of 3.43, between R (good) and U (very good), and a low 
FAT average score (2.48). Male calves were slaughtered 
one month later than females (387 d vs. 360 d) and had 
a higher cold carcass weight (305 kg vs. 231 kg) and 
CON score (3.61 vs 3.35) but a slightly lower FAT aver- 
age (2.47 vs 2.54) (results not shown in tables). These 
results show that under commercial conditions the 
Bruna dels Pirineus and the Pirenaica breeds have simi- 
lar performances [20], which are also similar to those 
previously reported for the same breeds under an 
experimental environment by Piedrafita et al. [21]. In 
addition, the Bruna dels Pirineus breed results were 
comparable to those from other European populations 
scored by the EUROP carcass classification system, such 
as the Swedish Charolais and Simmental populations 
studied by Eriksson et al. [1], but with a higher CON 
score and a smaller FAT score than the Irish popula- 
tions studied by Hickey et al. [2]. 

Threshold Linear animal Model (TLM) 

A standard alternative for analysis of categorical data 
such as CON and FAT scores is the threshold linear 
model or TLM [3-5]. Using TLM, sex, parity and age at 
slaughter effects reflected the expected physiological 
relationship among them (results not shown). Males 
showed larger CON scores than females, which is very 
similar to results of Altarriba et al. [20]. The situation 
was reversed for FAT, since females showed a higher 
FAT score than males, due to their greater precocity 
[22]. Calves from multiparous dams had higher CON 
scores than calves from primiparous dams, but these dif- 
ferences were not so large for FAT scores. Moreover, for 
the effect of age at slaughter, an almost linear increasing 
relationship was observed for CON scores (results not 
shown) but for FAT scores no clear tendency was 
detected. The difference in precocity among sexes did 
not generate a different effect of age at slaughter on 
FAT score between sexes because this interaction was
not significant in our data. Finally, significant differences
in CON and FAT scores were detected depending on 
the season and year of slaughter but there was no clear 
trend over time. 

These estimated regression coefficients were used to 
compute the bootstrap intervals for TLM. Significant 
fitting deficiencies were revealed because in many cases 
the actual frequency of CON and FAT scores was not 
within the bootstrap interval, especially when stratifying 
by slaughterhouse (Tables 1 and 2). This was because 
CON and FAT score frequencies varied significantly 
between slaughterhouses. For two slaughterhouses (11 
and 12), over 80% of the carcasses were qualified as R 
for CON, whereas in the other slaughterhouses most of 
the carcasses were qualified as U (Table 1). In the case 
of FAT scores, several slaughterhouses (1, 3, 4, 5, 8, 9 
and 12) qualified most carcasses with a value of 3, while 
in some slaughterhouses (2, 6, 7 and 10) the most fre- 
quent value was 2, and in one slaughterhouse (11) the 
most frequent value was 1 (Table 2). These differences 
among slaughterhouses can be explained either by the 
fact that some slaughterhouses prefer to slaughter light 
young animals (i.e. less than one year old) compared to
other slaughterhouses, or by the fact that both traits 
were scored by different technicians in each slaughter- 
house. Despite the existence of an objective European 
scoring system, each technician may have a different 
subjective interpretation (i.e. each technician puts the 
threshold at a different position). As in Varona et al. [5], 
this fact reveals the complexity of the normalization of 
carcass evaluation for CON and FAT scores, which can- 
not be accommodated by the TLM because it suffers 
from low flexibility due to the assumptions made in the 
model (i.e. all the slaughterhouses have the same thresh- 
old position). 

Specific Slaughterhouse Threshold Linear animal Model 
(SHTLM) 

The flexibility of threshold models was improved in 
SHTLM by estimating different thresholds per slaugh- 
terhouse in order to take the different subjective inter- 
pretations of scoring systems into account. The 
posterior means for the thresholds indicated a large var- 
iation among slaughterhouses (results not shown), in 
strong concordance with the heterogeneity of the raw 
data presented in Tables 1 and 2. Threshold position
$\tau_{sh,3}$ was negative for slaughterhouses in which most carcasses
were qualified as U for CON and positive for
slaughterhouses in which most carcasses were qualified
as R. For FAT, the threshold position $\tau_{sh,1}$ was positive
for slaughterhouse 11, in which most carcasses were
qualified as 1 (69.57%), and the threshold position $\tau_{sh,2}$
was over 0.45 for slaughterhouses (2, 6, 7 and 10) in
which most carcasses were qualified as 2. Using
SHTLM, most of the fitting deficiencies when stratifying
by slaughterhouse disappeared, as most of the frequen- 
cies of CON and FAT scores from actual data fell within 
the bootstrap intervals (results not shown). However, 
SHTLM still failed to correctly fit the frequencies by sex 
(Tables 3 and 4), especially for FAT score, since five of 
the eight actual percentages in Table 4 were not within 
the bootstrap interval. This fact indicates that the 
threshold positions for FAT scores differed by sex and 
that differences among sexes could not be totally cap- 
tured by a systematic effect, as fitted in SHTLM. 

Specific Sex per Slaughterhouse Threshold Linear animal 
Model (SEXTLM) 

The flexibility of threshold models was improved in 
SEXTLM by estimating different thresholds per sex in 
each slaughterhouse in order to take the different sub- 
jective interpretations of scoring systems by sex into 
account. Using SEXTLM, the frequencies of CON and 
FAT scores by sex were always within the bootstrapped
boundaries (Tables 3 and 4) and no fitting deficiencies 
were detected. This fact confirmed that the interpreta- 
tion of the scoring system was different for each sex in 
each slaughterhouse. 

Threshold log Linear Weibull Model (TlogLWM) 

This model assumed proportional (log-linear) effects on 
CON and FAT scores, instead of the additive effects 
assumed in the threshold linear models, but again 
slaughterhouse, sex, parity, age at slaughter, season and 
year had a significant effect on CON and FAT scores. 

Table 3 Percentages of carcass conformation stratified by sex

SEX          Carcass conformation
             O               R                 U                 E
Males        0.25            49.88             43.89             5.99
  TLM        (0.00-0.28)     (49.53-53.21)     (40.99-45.07)     (4.52-6.48)
  SHTLM      (0.00-0.22) *   (49.45-52.88)     (41.29-45.04)     (4.64-6.58)
  SEXTLM     (0.00-0.28)     (49.34-52.81)     (41.08-44.76)     (4.89-6.92)
  TlogLWM    (0.00-0.64)     (49.50-53.26)     (41.12-45.17)     (4.40-6.37)
  GDM        (0.03-0.56)     (49.47-53.30)     (41.24-45.29)     (4.18-6.02)
Females      0.96            72.73             24.17             2.14
  TLM        (0.16-1.18)     (72.03-76.52)     (21.87-26.47)     (0.43-1.63) **
  SHTLM      (0.11-0.96)     (71.39-75.78)     (22.78-27.11)     (0.43-1.60) **
  SEXTLM     (0.16-0.96)     (72.09-76.41)     (21.55-25.94)     (0.91-2.14)
  TlogLWM    (0.18-1.11)     (72.05-76.53)     (22.02-27.47)     (0.45-1.62) **
  GDM        (0.37-1.60)     (71.18-75.67)     (21.76-26.26)     (0.86-2.38)

Bootstrap confidence intervals (95%) in parentheses, and p-values from a
threshold linear model (TLM), a specific slaughterhouse threshold linear model
(SHTLM), a specific sex per slaughterhouse threshold linear model (SEXTLM),
a threshold log linear Weibull model (TlogLWM), and a grouped data model
(GDM).

Percentage outside the bootstrap interval if * (P < 0.05); ** (P < 0.01);
*** (P < 0.001).

Table 4 Percentages of fat cover values stratified by sex

SEX          FAT
             1                   2                   3                   4
Males        11.80               29.79               57.92               0.50
  TLM        (12.42-14.73)       (23.31-27.52)       (58.58-62.29) **    (0.21-0.99)
  SHTLM      (12.05-14.03) ***   (25.74-29.70) *     (56.60-60.27)       (0.37-1.36)
  SEXTLM     (10.48-12.62)       (28.30-32.30)       (55.52-59.20)       (0.33-1.28)
  TlogLWM    (12.21-14.56)       (23.56-27.79)       (58.34-62.01) ***   (0.23-1.00)
  GDM        (11.01-13.16)       (28.03-32.10)       (55.65-59.24)       (0.12-0.74)
Females      19.30               17.73               59.45               3.52
  TLM        (14.41-18.06) ***   (19.56-24.45) **    (57.69-61.77)       (1.11-3.06) ***
  SHTLM      (15.58-19.04)       (18.25-23.08) **    (57.56-61.86)       (1.43-3.32) ***
  SEXTLM     (18.19-21.51)       (14.66-19.04)       (58.47-62.38)       (1.89-3.98)
  TlogLWM    (14.93-18.52)       (18.99-23.67) **    (57.55-61.67)       (1.22-3.15)
  GDM        (18.25-21.84)       (16.17-20.93)       (56.45-60.82)       (1.83-3.85)

Bootstrap confidence intervals (95%) in parentheses, and p-values from a
threshold linear model (TLM), a specific slaughterhouse threshold linear model
(SHTLM), a specific sex per slaughterhouse threshold linear model (SEXTLM),
a threshold log linear Weibull model (TlogLWM), and a grouped data model
(GDM).

Percentage outside the bootstrap interval if * (P < 0.05); ** (P < 0.01);
*** (P < 0.001).

Male calves had a CON score 1.08 times higher than 
females, but females had a FAT score 1.03 times higher 
than males. Calves from multiparous dams had a CON 
score 1.08 times higher than calves from primiparous 
dams, and calves slaughtered over 14 months of age had 
a CON score 1.16 times higher than calves slaughtered 
before 9 months of age. In spite of the fact that these 
effects reflect the expected physiological relationship 
with CON and FAT scores, in the bootstrap analysis, 
TlogLWM failed to correctly fit the frequencies when 
stratifying by slaughterhouse and sex, especially for FAT 
(Tables 1 and 2). This fact again indicates that differ- 
ences in CON and FAT scores among slaughterhouses 
and sexes could not be totally captured by a systematic 
effect, as fitted in TlogLWM, and heterogeneous thresh- 
olds should be allowed for sex and slaughterhouse 
effects. 

Grouped Data Model (GDM) 

The previous model TlogLWM is a particular case of a 
grouped data model with a baseline Weibull distribu- 
tion. Its fitting deficiencies can be solved in GDM by 
assuming that slaughterhouse and sex effects are score- 
dependent. Likelihood ratio tests confirmed this fact and 
showed that slaughterhouse and sex effects were signifi- 
cantly score-dependent, especially for FAT score (P < 
0.001). Again, this fact reveals the complexity of
normalising carcass evaluations for CON and FAT
among slaughterhouses and sexes. In the bootstrap 
analysis, fitting deficiencies were not observed using 
GDM, as the frequencies of both traits when stratifying 
by each factor were always within the bootstrapped 
boundaries (Tables 3 and 4 for sex, and results not 
shown for the other factors). Including score-dependent 
effects gave great flexibility to GDM [9], and is similar 
to assume different thresholds positions by slaughter- 
house and sex in threshold linear models, i.e. estimating 
one parameter for each score. Thus, this is a useful way 
to improve the goodness-of-fit of the models with a 
small increase in the number of parameters to be 
estimated, since there were only four scores. 
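
The bootstrap check described above (are the observed
score percentages inside the bootstrapped boundaries?)
can be illustrated with a minimal sketch. It is a
simplified parametric bootstrap and not necessarily the
procedure used in this study: the fitted class
probabilities, the sample size and the observed
percentages below are hypothetical values chosen only
for illustration, and the function names are ours.

```python
import random


def bootstrap_intervals(probs, n, n_boot=1000, level=0.95, seed=1):
    """Percentile intervals for the percentage of records in each score class.

    Each replicate draws n scores from the fitted class probabilities
    (hypothetical inputs here) and records the percentage per class.
    """
    rng = random.Random(seed)
    classes = range(len(probs))
    sims = [[] for _ in classes]
    for _ in range(n_boot):
        draws = rng.choices(classes, weights=probs, k=n)
        for k in classes:
            sims[k].append(100.0 * draws.count(k) / n)
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    intervals = []
    for k in classes:
        ordered = sorted(sims[k])
        intervals.append((ordered[lo_idx], ordered[hi_idx]))
    return intervals


def outside_flags(intervals, observed):
    """True for each class whose observed percentage lies outside its interval."""
    return [not (lo <= obs <= hi) for (lo, hi), obs in zip(intervals, observed)]


# Hypothetical fitted probabilities for four score classes in one stratum.
fitted_probs = [0.10, 0.40, 0.35, 0.15]
# Hypothetical observed percentages in the same stratum.
observed_pct = [10.3, 41.0, 34.7, 14.0]

intervals = bootstrap_intervals(fitted_probs, n=300)
for (lo, hi), obs, bad in zip(intervals, observed_pct,
                              outside_flags(intervals, observed_pct)):
    print(f"observed {obs:5.1f}%  interval ({lo:.1f}-{hi:.1f})  outside={bad}")
```

A class flagged as outside corresponds to a fitting
deficiency for that stratum; repeating the check per
slaughterhouse and per sex mirrors the stratified
comparisons reported in the tables.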

Heritabilities and EBV correlations among models 

Estimates of variance components for the two traits are
presented in Table 5. In this study, only slight
differences in terms of variance components were noted
among models (except for the herd variance, σ²h).
Estimated heritabilities were similar for all models and
ranged from 0.29 (SEXTLM) to 0.35 (TlogLWM) for the CON
score, and from 0.21 (SHTLM) to 0.25 (TLM) for the FAT
score (Table 5). These heritability estimates indicate
that a sizeable fraction of the variance is additive
genetic and are within the range of estimates from
previous studies for the same subjective traits in
other populations evaluated with the
EUROP system [1,2,5,20]. 
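
As a worked check, the heritabilities in Table 5 can be
approximately recovered from the tabulated variance
components. The sketch below assumes that heritability
is the additive variance divided by the sum of the
additive, herd and error variances; this definition is
our assumption, chosen because it reproduces the
tabulated values, and the names used are illustrative.

```python
# Variance component estimates copied from Table 5.
# Tuple order: (additive, herd, error).
TABLE5 = {
    "CON": {
        "TLM": (0.344, 0.089, 0.666),
        "SHTLM": (1.206, 0.548, 2.304),
        "SEXTLM": (1.668, 0.735, 3.238),
        "TlogLWM": (0.621, 0.180, 1.0),
        "GDM": (0.609, 0.180, 1.0),
    },
    "FAT": {
        "TLM": (0.092, 0.037, 0.245),
        "SHTLM": (0.131, 0.063, 0.451),
        "SEXTLM": (0.144, 0.088, 0.454),
        "TlogLWM": (0.306, 0.151, 1.0),
        "GDM": (0.306, 0.170, 1.0),
    },
}


def heritability(additive, herd, error):
    """Assumed definition: additive variance over the total of the three components."""
    return additive / (additive + herd + error)


for trait, models in TABLE5.items():
    for model, components in models.items():
        print(f"{trait} {model}: h2 = {heritability(*components):.3f}")
```

The ratios agree with the reported heritabilities to
within about 0.005. Small discrepancies of this size
would be expected if, for example, the reported values
summarise the ratio itself rather than being computed
from the summarised components.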

The heterogeneity of the models described above had 
a marked impact on the prediction of EBV. For thresh- 
old linear models, the correlations were over 0.98 for 
CON and 0.95 for FAT scores between EBV from TLM 
and SEXTLM (Figures 1 and 2), much higher than the 
results of Varona et al. [5]. For grouped data models, 
the correlations were over 0.98 for CON and 0.96 for 
FAT scores between EBV from TlogLWM and GDM. 



Table 5 Heritability estimates for carcass conformation
and fat cover

Trait  Parameter  TLM    SHTLM  SEXTLM  TlogLWM  GDM
CON    σ²u        0.344  1.206  1.668   0.621    0.609
       σ²h        0.089  0.548  0.735   0.180    0.180
       σ²e        0.666  2.304  3.238   1        1
       h²         0.313  0.300  0.291   0.345    0.340
FAT    σ²u        0.092  0.131  0.144   0.306    0.306
       σ²h        0.037  0.063  0.088   0.151    0.170
       σ²e        0.245  0.451  0.454   1        1
       h²         0.245  0.205  0.207   0.210    0.207

Estimated additive (σ²u), herd (σ²h) and error (σ²e) variances and heritabilities
(h²) for carcass conformation (CON) and fat cover (FAT) under a threshold
linear model (TLM), a specific slaughterhouse threshold linear model (SHTLM),
a specific sex per slaughterhouse threshold linear model (SEXTLM), a
threshold log linear Weibull model (TlogLWM), and a grouped data model
(GDM).

Figure 1 Bivariate plot of estimated breeding values for
carcass conformation. Comparison of the threshold linear model
and the specific sex by slaughterhouse threshold linear model



Figure 2 Bivariate plot of estimated breeding values for fat
cover. Comparison of the threshold linear model and the specific
sex by slaughterhouse threshold linear model

Figure 3 Bivariate plot of estimated breeding values for
carcass conformation. Comparison of the specific sex by
slaughterhouse threshold linear model and the grouped data model

Figure 4 Bivariate plot of estimated breeding values for fat
cover. Comparison of the specific sex by slaughterhouse threshold
linear model and the grouped data model

Correlations between EBV from SEXTLM and GDM
dropped to around -0.90 (Figures 3 and 4) because
the two models rest on different assumptions.
Whereas SEXTLM assumes that the effect of the EBV is
additive on the underlying variable, GDM assumes
that the effect of the EBV is exponentiated and
multiplies the underlying variable by some constant.
The correlations between EBV from SEXTLM and GDM were
negative because a negative EBV for an animal in GDM
meant higher CON and FAT scores, e.g. an EBV of
-0.20 meant exp(-(-0.20)) = 1.22 times higher
performance. However, although the predictions of EBV
differed, both models can be used to analyse CON and
FAT scores with a correct goodness-of-fit. Therefore,
there is a need for an appropriate procedure, e.g.
predictive ability criteria, to rank models properly for a better
choice of the model for genetic evaluation. 
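
The sign reversal between the two EBV scales can be
made concrete with a short sketch. It uses only the
exp(-EBV) convention quoted in the example above; the
function name is ours and the input values are
illustrative.

```python
import math


def gdm_multiplier(ebv):
    """Multiplicative change in performance implied by a GDM breeding value,
    following the exp(-EBV) convention of the worked example in the text."""
    return math.exp(-ebv)


for ebv in (-0.20, 0.0, 0.20):
    print(f"EBV {ebv:+.2f} -> x{gdm_multiplier(ebv):.2f}")
# EBV -0.20 -> x1.22
# EBV +0.00 -> x1.00
# EBV +0.20 -> x0.82
```

A negative GDM breeding value therefore corresponds to
a multiplier above 1, whereas under SEXTLM a favourable
animal has a positive EBV on the additive scale, which
is why the two sets of EBV are negatively correlated.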

Conclusions 

Significant fitting deficiencies were revealed when
analysing carcass conformation and fat cover scores using a
threshold linear model with homogeneous thresholds. 
When a specific sex by slaughterhouse threshold model 
was considered, the fitting deficiencies were solved. 
Similar results were also obtained when heterogeneous 
thresholds were assumed in grouped data models that 
estimate score-dependent sex and slaughterhouse effects. 
The estimated heritabilities obtained from all models 
indicated that a sizeable fraction of the variance of both 
traits was additive genetic. In addition to a
goodness-of-fit procedure such as the one used in this
work, an appropriate procedure, e.g. predictive ability
criteria, is needed to rank models properly for genetic
evaluation in large field applications.

List of abbreviations used 

CON: carcass conformation; EBV: estimated breeding values; FAT: fat cover; 
GDM: grouped data model; SEXTLM: specific sex per slaughterhouse 
threshold linear model; SHTLM: specific slaughterhouse threshold linear 
model; TLM: threshold linear model; TlogLWM: threshold log-linear Weibull 
model.